Sleep stage classification


NeuroLingua: A Language-Inspired Hierarchical Framework for Multimodal Sleep Stage Classification Using EEG and EOG

Samaee, Mahdi, Yazdi, Mehran, Massicotte, Daniel

arXiv.org Artificial Intelligence

We propose NeuroLingua, a language-inspired framework that conceptualizes sleep as a structured physiological language. Each 30-second epoch is decomposed into overlapping 3-second subwindows ("tokens") using a CNN-based tokenizer, enabling hierarchical temporal modeling through dual-level Transformers: intra-segment encoding of local dependencies and inter-segment integration across seven consecutive epochs (3.5 minutes) for extended context. Modality-specific embeddings from EEG and EOG channels are fused via a Graph Convolutional Network, facilitating robust multimodal integration. NeuroLingua is evaluated on the Sleep-EDF Expanded and ISRUC-Sleep datasets, achieving state-of-the-art results on Sleep-EDF (85.3% accuracy, 0.800 macro F1, and 0.796 Cohen's κ) and competitive performance on ISRUC (81.9% accuracy, 0.802 macro F1, and 0.755 κ), matching or exceeding published baselines in overall and per-class metrics. The architecture's attention mechanisms enhance the detection of clinically relevant sleep microevents, providing a principled foundation for future interpretability, explainability, and causal inference in sleep research. By framing sleep as a compositional language, NeuroLingua unifies hierarchical sequence modeling and multimodal fusion, advancing automated sleep staging toward more transparent and clinically meaningful applications. Index Terms -- Sleep staging, EEG, EOG, Polysomnography, Deep learning, Hierarchical sequence modeling, Multimodal fusion, Transformers, Graph neural networks, Interpretability, Explainability, Causal inference.
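The epoch-to-token decomposition described above can be sketched as a sliding window over one 30-second epoch. This is a minimal illustration, not the paper's implementation: the 3-second subwindow length is from the abstract, but the sampling rate (100 Hz) and hop length (1.5 s, i.e. 50% overlap) are assumptions, since the abstract does not state them.

```python
import numpy as np

def tokenize_epoch(epoch, fs=100, win_sec=3.0, hop_sec=1.5):
    """Split one 30-second epoch into overlapping subwindow "tokens".

    fs and hop_sec are illustrative assumptions; the abstract specifies
    3-second subwindows but not the sampling rate or hop length.
    """
    win = int(win_sec * fs)
    hop = int(hop_sec * fs)
    starts = range(0, len(epoch) - win + 1, hop)
    return np.stack([epoch[s:s + win] for s in starts])

epoch = np.random.randn(30 * 100)   # one 30-s epoch at the assumed 100 Hz
tokens = tokenize_epoch(epoch)
print(tokens.shape)                 # (19, 300) with these settings
```

Each row of `tokens` would then be embedded by the CNN tokenizer before the intra-segment Transformer models dependencies across the 19 tokens.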


Automated Video-EEG Analysis in Epilepsy Studies: Advances and Challenges

Zuev, Valerii A., Salmagambetova, Elena G., Djakov, Stepan N., Utkin, Lev V.

arXiv.org Artificial Intelligence

Epilepsy is typically diagnosed through electroencephalography (EEG) and long-term video-EEG (vEEG) monitoring. The manual analysis of vEEG recordings is time-consuming, necessitating automated tools for seizure detection. Recent advancements in machine learning have shown promise in real-time seizure detection and prediction using EEG and video data. However, diversity of seizure symptoms, markup ambiguities, and limited availability of multimodal datasets hinder progress. This paper reviews the latest developments in automated video-EEG analysis and discusses the integration of multimodal data. We also propose a novel pipeline for treatment effect estimation from vEEG data using concept-based learning, offering a pathway for future research in this domain.


PhysioME: A Robust Multimodal Self-Supervised Framework for Physiological Signals with Missing Modalities

Lee, Cheol-Hui, Lee, Hwa-Yeon, Jung, Min-Kyung, Kim, Dong-Joo

arXiv.org Artificial Intelligence

Missing or corrupted modalities are common in physiological signal-based medical applications owing to hardware constraints or motion artifacts. However, most existing methods assume the availability of all modalities, resulting in substantial performance degradation in the absence of any modality. To overcome this limitation, this study proposes PhysioME, a robust framework designed to ensure reliable performance under missing modality conditions. PhysioME adopts: (1) a multimodal self-supervised learning approach that combines contrastive learning with masked prediction; (2) a Dual-PathNeuroNet backbone tailored to capture the temporal dynamics of each physiological signal modality; and (3) a restoration decoder that reconstructs missing modality tokens, enabling flexible processing of incomplete inputs. The experimental results show that PhysioME achieves high consistency and generalization performance across various missing modality scenarios. These findings highlight the potential of PhysioME as a reliable tool for supporting clinical decision-making in real-world settings with imperfect data availability.


EEG Sleep Stage Classification with Continuous Wavelet Transform and Deep Learning

Gashti, Mehdi Zekriyapanah, Farjamnia, Ghasem

arXiv.org Artificial Intelligence

Accurate classification of sleep stages is crucial for the diagnosis and management of sleep disorders. Conventional approaches for sleep scoring rely on manual annotation or features extracted from EEG signals in the time or frequency domain. This study proposes a novel framework for automated sleep stage scoring using time-frequency analysis based on the wavelet transform. The Sleep-EDF Expanded Database (sleep-cassette recordings) was used for evaluation. The continuous wavelet transform (CWT) generated time-frequency maps that capture both transient and oscillatory patterns across frequency bands relevant to sleep staging. Experimental results demonstrate that the proposed wavelet-based representation, combined with ensemble learning, achieves an overall accuracy of 88.37% and a macro-averaged F1 score of 73.15%, outperforming conventional machine learning methods and exhibiting comparable or superior performance to recent deep learning approaches. [Published in the MUST Journal of Research and Development (MJRD), Volume 6, Issue 3, September 2025; eISSN 2683-6467, pISSN 2683-6475.] 1.0 Introduction: Sleep is a vital physiological process essential for memory consolidation, learning, and overall brain health. Sleep disruptions are strongly associated with a wide range of neurological and psychiatric conditions, including epilepsy, Alzheimer's disease, depression, and traumatic brain injury (Kang et al.
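The kind of time-frequency map the CWT produces can be sketched with a naive complex-Morlet implementation in plain numpy. The wavelet parameter, frequency grid, and sampling rate below are illustrative assumptions; the paper's exact CWT settings are not given in the abstract.

```python
import numpy as np

def morlet_cwt(x, fs, freqs, w=6.0):
    """Naive CWT magnitude map with a complex Morlet wavelet.

    Returns a (len(freqs), len(x)) time-frequency map. The parameter w
    and the frequency grid are illustrative choices, not the paper's.
    """
    out = np.empty((len(freqs), len(x)))
    for i, f in enumerate(freqs):
        # Scale so the wavelet's centre frequency matches f.
        s = w * fs / (2 * np.pi * f)               # Gaussian std, in samples
        t = np.arange(-4 * s, 4 * s + 1) / fs      # wavelet support, in seconds
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-(t * fs) ** 2 / (2 * s ** 2))
        wavelet /= np.sqrt(s)                      # rough amplitude normalisation
        out[i] = np.abs(np.convolve(x, wavelet, mode="same"))
    return out

fs = 100
t = np.arange(0, 30, 1 / fs)
x = np.sin(2 * np.pi * 10 * t)                     # 10 Hz alpha-like tone
tfr = morlet_cwt(x, fs, freqs=np.arange(1, 31))
print(tfr.shape)                                   # (30, 3000)
```

For this pure 10 Hz input, the map's energy concentrates in the 10 Hz row; on real EEG, such maps localise the transient (spindles, K-complexes) and oscillatory patterns mentioned above before they are fed to the ensemble classifier.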



Multi-Channel Differential Transformer for Cross-Domain Sleep Stage Classification with Heterogeneous EEG and EOG

Chin, Benjamin Wei Hao, Yew, Yuin Torng, Wu, Haocheng, Liang, Lanxin, Chan, Chow Khuen, Zain, Norita Mohd, Samdin, Siti Balqis, Goh, Sim Kuan

arXiv.org Artificial Intelligence

Classification of sleep stages is essential for assessing sleep quality and diagnosing sleep disorders. However, manual inspection of EEG characteristics for each stage is time-consuming and prone to human error. Although machine learning and deep learning methods have been actively developed, they continue to face challenges arising from the non-stationarity and variability of electroencephalography (EEG) and electrooculography (EOG) signals across diverse clinical configurations, often resulting in poor generalization. In this work, we propose SleepDIFFormer, a multi-channel differential transformer framework for heterogeneous EEG-EOG representation learning. SleepDIFFormer is trained across multiple sleep staging datasets, each treated as a source domain, with the goal of generalizing to unseen target domains. Specifically, it employs a Multi-channel Differential Transformer Architecture (MDTA) designed to process raw EEG and EOG signals while incorporating cross-domain alignment. Our approach mitigates spatial and temporal attention noise and learns a domain-invariant EEG-EOG representation through feature distribution alignment across datasets, thereby enhancing generalization to new domains. Empirically, we evaluated SleepDIFFormer on five diverse sleep staging datasets under domain generalization settings and benchmarked it against existing approaches, achieving state-of-the-art performance. We further conducted a comprehensive ablation study and interpreted the differential attention weights, demonstrating their relevance to characteristic sleep EEG patterns. These findings advance the development of automated sleep stage classification and highlight its potential in quantifying sleep architecture and detecting abnormalities that disrupt restorative rest. Our source code and checkpoint are made publicly available at https://github.com/Ben1001409/SleepDIFFormer
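The differential attention at the core of MDTA can be sketched in numpy, following the general recipe of Ye et al.'s Differential Transformer: two softmax attention maps are computed and subtracted, so attention noise common to both is cancelled. The weight shapes and the fixed lambda below are illustrative assumptions; SleepDIFFormer's exact parameterization (e.g. a learnable, re-parameterized lambda) may differ.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def differential_attention(x, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
    """Single-head differential attention (sketch): the difference of
    two attention maps suppresses common-mode attention noise."""
    d = Wq1.shape[1]
    a1 = softmax((x @ Wq1) @ (x @ Wk1).T / np.sqrt(d))
    a2 = softmax((x @ Wq2) @ (x @ Wk2).T / np.sqrt(d))
    return (a1 - lam * a2) @ (x @ Wv)

rng = np.random.default_rng(0)
T, D, d = 8, 16, 8                          # tokens, model dim, head dim
x = rng.standard_normal((T, D))
Ws = [rng.standard_normal((D, d)) * 0.1 for _ in range(5)]
y = differential_attention(x, *Ws)
print(y.shape)                              # (8, 8)
```

Inspecting `a1 - lam * a2` per head is what the paper's interpretation of differential attention weights amounts to: the residual map should highlight stage-characteristic EEG/EOG patterns rather than broadly distributed noise.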


MetaSTH-Sleep: Towards Effective Few-Shot Sleep Stage Classification for Health Management with Spatial-Temporal Hypergraph Enhanced Meta-Learning

Li, Jingyu, Zhang, Tiehua, Wang, Jinze, Zhang, Yi, Li, Yuhuan, Zhao, Yifan, Shen, Zhishu, Wu, Libing, Liu, Jiannan

arXiv.org Artificial Intelligence

Accurate classification of sleep stages based on bio-signals is fundamental not only for automatic sleep stage annotation, but also for clinical health management and continuous sleep monitoring. Traditionally, this task relies on experienced clinicians to manually annotate data, a process that is both time-consuming and labor-intensive. In recent years, deep learning methods have shown promise in automating this task. However, three major challenges remain: (1) deep learning models typically require large-scale labeled datasets, making them less effective in real-world settings where annotated data is limited; (2) significant inter-individual variability in bio-signals often results in inconsistent model performance when applied to new subjects, limiting generalization; and (3) existing approaches often overlook the high-order relationships among bio-signals, failing to simultaneously capture signal heterogeneity and spatial-temporal dependencies. To address these issues, we propose MetaSTH-Sleep, a few-shot sleep stage classification framework based on spatial-temporal hypergraph enhanced meta-learning. Our approach enables rapid adaptation to new subjects using only a few labeled samples, while the hypergraph structure effectively models complex spatial interconnections and temporal dynamics simultaneously in EEG signals. Experimental results demonstrate that MetaSTH-Sleep achieves substantial performance improvements across diverse subjects, offering valuable insights to support clinicians in sleep stage annotation.


Energy-Efficient Real-Time 4-Stage Sleep Classification at 10-Second Resolution: A Comprehensive Study

Mohammadi, Zahra, Fazel, Parnian, Mohammadi, Siamak

arXiv.org Artificial Intelligence

Sleep stage classification plays a crucial role in health monitoring, particularly for diagnosing and managing sleep disorders such as sleep apnea and insomnia. However, conventional clinical approaches like polysomnography are often costly, inconvenient, and impractical for long-term, home-based monitoring. In this study, we present an energy-efficient classification approach for detecting four sleep stages--wake, rapid eye movement (REM), light sleep, and deep sleep--using a single-lead electrocardiogram (ECG) signal. We evaluate and compare the performance of various machine-learning and deep-learning models. To support this, we introduce two novel windowing strategies: (1) a 5-minute window with 30-second steps for machine-learning models utilizing handcrafted features, and (2) a 30-second window with 10-second steps for deep-learning models, enabling near-real-time predictions with 10-second temporal resolution. Although lightweight, our deep-learning models--such as MobileNet-v1--achieve high classification performance (up to 92% accuracy and 91% F1-score), their energy demands remain high, making them sub-optimal for wearable applications. To address this, we design a SleepLiteCNN optimized specifically for ECG-based sleep staging. To further enhance efficiency, we apply 8-bit quantization, which leaves classification performance unchanged, reducing the energy usage of our SleepLiteCNN to just 5.48 µJ per inference at a 45 nm technology node, with 90% accuracy and 90% F1-score. We further demonstrate that deploying this SleepLiteCNN on a field-programmable gate array (FPGA) significantly reduces resource usage through quantization. Overall, this approach provides a practical and efficient solution for continuous ECG-based sleep monitoring in compact, resource-constrained wearable devices. Sleep stage classification is crucial in health monitoring, especially for diagnosing and managing sleep disorders such as sleep apnea and insomnia [2].
Sleep stages are categorized into wake, REM, and Non-Rapid Eye Movement (NREM) stages. Z. Mohammadi is a Ph.D. Candidate in the School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran (e-mail: zahramohammmadi@ut.ac.ir). Fazel is a Graduate of the School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran (e-mail: parnian.fazel@ut.ac.ir).
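The two windowing strategies above reduce to generating sliding-window start indices with different window/step pairs. The sketch below uses an assumed ECG sampling rate of 125 Hz (the abstract does not state one); the window and step lengths are the paper's.

```python
import numpy as np

def window_starts(n_samples, fs, win_sec, step_sec):
    """Start indices of a sliding window over an ECG record."""
    win, step = int(win_sec * fs), int(step_sec * fs)
    return np.arange(0, n_samples - win + 1, step)

fs = 125                                    # assumed ECG sampling rate
n = 60 * 60 * fs                            # one hour of signal
ml_starts = window_starts(n, fs, 300, 30)   # 5-min window, 30-s step (ML)
dl_starts = window_starts(n, fs, 30, 10)    # 30-s window, 10-s step (DL)
print(len(ml_starts), len(dl_starts))       # 111 358
```

The second scheme emits a prediction every 10 seconds while each input still covers a full 30-second context, which is what gives the deep-learning models their near-real-time, 10-second temporal resolution.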


PSG-MAE: Robust Multitask Sleep Event Monitoring using Multichannel PSG Reconstruction and Inter-channel Contrastive Learning

Wang, Yifei, Liu, Qi, Min, Fuli, Wang, Honghao

arXiv.org Artificial Intelligence

Polysomnography (PSG) signals are essential for studying sleep processes and diagnosing sleep disorders. Analyzing PSG data through deep neural networks (DNNs) for automated sleep monitoring has become increasingly feasible. However, the limited availability of datasets for certain sleep events often leads to DNNs focusing on a single task with a single-sourced training dataset. As a result, these models struggle to transfer to new sleep events and lack robustness when applied to new datasets. To address these challenges, we propose PSG-MAE, a mask autoencoder (MAE) based pre-training framework. By performing self-supervised learning on a large volume of unlabeled PSG data, PSG-MAE develops a robust feature extraction network that can be broadly applied to various sleep event monitoring tasks. Unlike conventional MAEs, PSG-MAE generates complementary masks across PSG channels, integrates a multichannel signal reconstruction method, and employs a self-supervised inter-channel contrastive learning (ICCL) strategy. This approach enables the encoder to capture temporal features from each channel while simultaneously learning latent relationships between channels, thereby enhancing the utilization of multichannel information. Experimental results show that PSG-MAE effectively captures both temporal details and inter-channel information from PSG signals. When the encoder pre-trained through PSG-MAE is fine-tuned with downstream feature decomposition networks, it achieves an accuracy of 83.7% for sleep staging and 90.45% for detecting obstructive sleep apnea, which highlights the framework's robustness and broad applicability.
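The complementary masking idea can be sketched as pairing up PSG channels and hiding opposite token sets in each member of a pair: a token masked in one channel stays visible in its partner, so the decoder can reconstruct it from cross-channel context. The pairing scheme and mask ratio below are illustrative assumptions; PSG-MAE's exact masking strategy may differ.

```python
import numpy as np

def complementary_masks(n_channels, n_tokens, ratio=0.5, seed=0):
    """Boolean masks (True = hidden) that are complementary per channel pair."""
    rng = np.random.default_rng(seed)
    masks = np.zeros((n_channels, n_tokens), dtype=bool)
    for c in range(0, n_channels - 1, 2):
        hide = rng.random(n_tokens) < ratio
        masks[c] = hide
        masks[c + 1] = ~hide        # partner channel hides the complement
    return masks

m = complementary_masks(4, 10)
print((m[0] ^ m[1]).all(), (m[2] ^ m[3]).all())   # True True
```

Because every token position is visible in exactly one channel of each pair, reconstruction forces the encoder to learn the latent inter-channel relationships that the ICCL objective then sharpens.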


Toward Foundational Model for Sleep Analysis Using a Multimodal Hybrid Self-Supervised Learning Framework

Lee, Cheol-Hui, Kim, Hakseung, Yoon, Byung C., Kim, Dong-Joo

arXiv.org Artificial Intelligence

Sleep is essential for maintaining human health and quality of life. Analyzing physiological signals during sleep is critical in assessing sleep quality and diagnosing sleep disorders. However, manual diagnoses by clinicians are time-intensive and subjective. Despite advances in deep learning that have enhanced automation, these approaches remain heavily dependent on large-scale labeled datasets. This study introduces SynthSleepNet, a multimodal hybrid self-supervised learning framework designed for analyzing polysomnography (PSG) data. SynthSleepNet effectively integrates masked prediction and contrastive learning to leverage complementary features across multiple modalities, including electroencephalogram (EEG), electrooculography (EOG), electromyography (EMG), and electrocardiogram (ECG). This approach enables the model to learn highly expressive representations of PSG data. Furthermore, a temporal context module based on Mamba was developed to efficiently capture contextual information across signals. SynthSleepNet achieved superior performance compared to state-of-the-art methods across three downstream tasks: sleep-stage classification, apnea detection, and hypopnea detection, with accuracies of 89.89%, 99.75%, and 89.60%, respectively. The model demonstrated robust performance in a semi-supervised learning environment with limited labels, achieving accuracies of 87.98%, 99.37%, and 77.52% in the same tasks. These results underscore the potential of the model as a foundational tool for the comprehensive analysis of PSG data. SynthSleepNet demonstrates comprehensively superior performance across multiple downstream tasks compared to other methodologies, making it expected to set a new standard for sleep disorder monitoring and diagnostic systems.